Towards Data Mining Operators in Database Systems: Algebra and Implementation
Abstract
The KDD process is a non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. This process comprises several steps which are invoked and parametrized in an interactive and iterative manner. A uniform framework for different kinds of patterns and operators is needed to support KDD efficiently and in an integrated way. Furthermore, because of large data sets it is necessary to scale up mining algorithms in order to achieve fast user support. One task of scaling data mining algorithms is the integration of KDD operators in database management systems. Two aspects of supporting KDD are addressed in this paper. First, a uniform framework is proposed that is based on constraint database concepts as well as interestingness values of patterns. Different operators are defined uniformly in that model. Second, DBMS-coupled implementations of selected operators for decision tree mining are discussed.
Similar resources
Geo-Relational Algebra: A Model and Query Language for Geometric Database Systems
The user's conceptual model of a database system for geometric data should be simple and precise (easy to learn and understand, with clearly defined semantics), expressive (allowing all desired query and data manipulation tasks to be expressed with ease), and efficiently implementable. To achieve these goals we propose to extend relational database management systems by integrating geometry at all levels: At...
Efficiency score assessment of Iranian Mining, Wood and Textile Industries
The Iranian Environment Protection Agency (IEPA), in collaboration with the Iranian Industries Organization (IIO), needs to design a relevant database for industry information based on the initial screening by the Iranian Evaluator Team (IET) in certain clusters. However, we are aware that all industrial projects should go through the Environmental Impact Assessment (EIA) after and along wi...
EquipAsso: An Algorithm based on New Relational Algebraic Operators for Association Rules Discovery
The search for interesting relationships among data has always been a research focus in data mining. The overall performance of mining association rules is determined by the discovery of the large itemsets, i.e., the itemsets whose support is above a predetermined minimum support. The algorithms proposed for association rules show different approaches to generate all large...
Interactivity, Scalability and Resource Control for Efficient KDD Support in DBMS
The conflict between resource consumption and query performance in the data mining context often has no satisfactory solution. This not only stands in sharp contrast to the need of the analysts for interactive response times, but also makes the seamless integration of data mining operators into common multiuser database systems a difficult and (so far) not very prosperous task. We believe that ...
A Better Approach for Horizontal Aggregations in SQL Using Data Sets for Data Mining Analysis
To analyze data efficiently, data mining systems widely use data sets with columns in a horizontal tabular layout. Generally, preparing a data set is the most complex task in a data mining project, requiring many complex SQL queries, aggregating columns, and joining tables. Conventional RDBMS usually manage tables in vertical form. Aggregated columns in a horizontal tabular layout retu...
Publication date: 2002